Abstract — Energy management in Hybrid Electric Vehicles (HEVs) has been actively studied recently because of its potential to
significantly improve fuel economy and emission control. Because of the dual-power-source nature and the complex configuration and
operation modes in an HEV, energy management is more complicated and important than in a conventional vehicle. Most of the existing
vehicle  power  optimization  approaches  do  not  incorporate  knowledge  about  driving  patterns  into  their  vehicle  energy  management 
strategies.  Our  approach  is  to  use  machine  learning  technology  combined  with  roadway  type  and  traffic  congestion  level  specific 
optimization  to  achieve  quasi-optimal  energy  management  in  hybrid  vehicles.  In  this  series  of  two  papers,  we  present  a  machine 
learning  framework  that  combines  Dynamic  Programming  with  machine  learning  to  learn  about  roadway  type  and  traffic  congestion 
level  specific  energy  optimization,  and  an  integrated  online  intelligent  power  controller  to  achieve  quasi-optimal  energy  management  in 
hybrid vehicles. These two papers cover the modeling of power flow in HEVs, the mathematical background of optimization in energy
management in HEVs, machine learning algorithms, and real-time optimal control of energy flow in an HEV.

This first paper presents our research in machine learning for optimal energy management in HEVs. We present a machine
learning framework, ML_EMO_HEV, developed for the optimization of energy management in an HEV, together with machine learning algorithms
for predicting driving environments and generating the optimal power split for a given driving environment. Experiments are conducted
based  on  a  simulated  Ford  Escape  Hybrid  vehicle  model  provided  by  Argonne  National  Laboratory's  PSAT  (Powertrain  Systems 
Analysis  Toolkit).  Based  on  the  experimental  results  on  the  test  data,  we  can  conclude  that  the  neural  networks  trained  under  the 
ML_EMO_HEV  framework  are  effective  in  predicting  roadway  type  and  traffic  congestion  levels,  in  predicting  driving  trend  and  in 
learning  optimal  engine  speed  and  optimal  battery  power  from  Dynamic  Programming. 
I. INTRODUCTION

Development of new vehicles with high fuel efficiency and fewer emissions has become a new focus of research in the
automobile  industry  due  to  growing  energy  and  environmental  concerns.  Hybrid  Electric  Vehicles  (HEVs)  have  emerged  as  a 
promising  advanced  technology  to  improve  fuel  economy  while  meeting  the  tightened  emissions  standards.  The  improved  fuel 
economy  of  HEVs  is  achieved  by  optimizing  the  architecture  and  the  various  devices  and  components  of  the  vehicle  system,  as 
well  as  the  energy  management  strategy  that  is  used  to  efficiently  control  the  energy  flow  through  the  vehicle  system.  In  this 
research  we  focus  on  the  issue  of  optimizing  vehicle  energy  management  to  improve  fuel  economy. 

An  HEV  combines  two  or  more  energy  sources,  e.g.,  internal  gasoline  combustion  engine  (ICE)  and  battery,  in  its  propulsion 
system  to  move  the  vehicle.  With  the  use  of  a  secondary  power  source,  an  HEV  uses  a  smaller  and  more  efficient  engine  in  its 
drivetrain.  Because  of  the  dual-power-source  nature,  the  design  and  implementation  of  an  HEV  system  is  a  challenging  problem. 
The power control strategy that splits the power between chemical fuel and stored electricity plays an important role in the overall
fuel  efficiency  and  amount  of  emissions.  The  goal  of  the  power  control  strategy  is  to  minimize  the  total  fuel  consumption  and 
emissions  without  sacrificing  vehicle  performance,  safety,  and  reliability.  In  order  to  meet  these  challenges,  it  is  very  important  to 
optimize  the  architecture  and  the  various  devices  and  components  of  the  vehicle  system,  as  well  as  the  energy  management 
strategy  that  is  used  to  efficiently  control  the  energy  flow  through  the  vehicle  system. 

Existing real-time power control strategies are largely based on heuristic control rules or fuzzy logic for control algorithm
development. Wipke et al. used a strategy that adopts a rule-based structure in the control logic by defining a set of thresholds
through an optimization process [1]. Jeon et al. proposed a rule-based multi-mode driving control strategy for parallel HEVs that
uses an algorithm optimized for a recognized driving pattern [2]. Zhu et al. implemented a fuzzy rule-based power
controller in which the fuzzy rules are extracted by studying the optimization result for the given cycle [3]-[4]. Schouten et al.
implemented a load-leveling and charge-sustaining strategy by using a fuzzy logic technique [5]. These heuristic rule and fuzzy
logic-based strategies mostly stem from engineering intuition, which is sometimes far from the actual optimal solution.

An  alternative  approach  is  to  apply  an  optimal  control  method  such  as  linear  programming  [6],  optimal  control  [7],  and 
especially  dynamic  programming  (DP)  [8]-[10]  to  the  power  distribution  and  management  problem.  In  general,  these  techniques 
require  the  knowledge  of  the  entire  drive  cycle  in  advance.  Therefore  they  do  not  offer  an  online  solution.  Furthermore,  an 
optimal  power  split  solution  for  a  given  specific  drive  cycle  might  be  neither  optimal  nor  charge-sustaining  under  other  cycles.  To 
address  these  issues,  a  number  of  techniques  have  been  proposed.  Paganelli  et  al.  used  an  instantaneous  optimization  method  that 
reduces  the  problem  to  a  minimization  of  equivalent  fuel  consumption  at  each  time  instance  [11].  If  only  the  present  state  of  the 
vehicle is considered, the optimization of the operating points of the individual components can still be beneficial; however, the
benefits will be limited [11]-[13]. Lin et al. proposed a stochastic dynamic programming (SDP) method in an attempt to obtain the
optimization for general driving conditions, rather than a specific driving cycle, using power demand probability [14].

Recent  research  has  shown  that  the  current  driving  environment  and  the  driver's  driving  style  have  a  strong  influence  over  a 
vehicle's  fuel  consumption  and  emissions  [15]-[19].  Driving  patterns  exhibited  by  a  real  world  driver  are  the  product  of  the 
instantaneous  decisions  of  the  driver  to  respond  to  the  (physical)  driving  environment.  Specifically,  varying  roadway  type  and 
traffic  congestion  level,  driving  trends,  driving  styles,  and  vehicle  operating  modes  have  various  degrees  of  impact  on  fuel 
consumption. However, most of the existing vehicle power optimization approaches do not incorporate knowledge about driving
patterns  into  their  vehicle  energy  management  strategies.  Our  approach  is  to  use  machine  learning  technology  combined  with 
roadway  type  and  traffic  congestion  level  specific  optimization  to  achieve  quasi-optimal  energy  management  in  hybrid  vehicles. 
It  is  our  contribution  to  use  the  current  driving  environment  to  predict  the  future  driving  conditions,  train  an  online  energy 
management  system  using  machine  learning  to  emulate  the  optimal  solutions  generated  by  Dynamic  Programming  (DP)  for 
specific  roadway  types  and  traffic  congestion  levels,  and  generalize  the  optimal  power  settings  to  real  world  vehicle  operation 
based  on  the  predicted  real-time  roadway  types  and  traffic  congestion  levels. 

This  paper,  the  first  in  a  series  of  two,  presents  our  research  in  the  development  of  a  machine  learning  framework, 
ML_EMO_HEV,  for  the  optimization  of  energy  management  in  an  HEV.  In  the  ML_EMO_HEV  framework,  algorithms  are 
developed  to  learn  energy  optimization  based  on  long  and  short  term  knowledge  about  the  driving  environment.  The  long  term 
knowledge  about  the  driving  environment  is  represented  by  the  type  of  the  drive  cycle  the  driver  is  in  for  the  next  few  minutes. 
The  short  term  knowledge  is  the  driver's  immediate  reaction  to  the  driving  environment  at  each  time  instance.  In  the  second  paper 
of  the  series,  we  will  present  the  intelligent  online  energy  controller  developed  under  the  framework  of  ML_EMO_HEV  and 
trained  by  the  machine  learning  algorithms  presented  in  this  paper  to  minimize  the  fuel  consumption  while  maintaining  vehicle 
performance. 

The  paper  is  organized  as  follows.  Section  II  introduces  an  HEV  model  and  energy  optimization  in  an  HEV.  Section  III 
presents  the  machine  learning  technologies  we  developed  for  optimal  vehicle  energy  management.  Section  IV  concludes  this 
paper. 

II.  ENERGY  OPTIMIZATION  IN  AN  HEV 

The energy management problem for HEVs is a dynamic optimization problem. In the discrete time format, the dynamics of an
HEV system can be defined by the state transition equation:

$$x(t+1) = f(x(t), u(t), t) \tag{1}$$

where $u(t)$ is the vector of control variables such as engine speed, engine power and battery charging or discharging power, and
$x(t)$ is the state variable in the HEV system, which is represented by the battery state of charge level. The HEV system can be
interpreted as taking the action of control variables $u(t)$ at time $t$ in the given state $x(t)$ and transforming the vehicle to the next
state $x(t + \Delta t)$, $\Delta t > 0$. The energy optimization of the HEV can be formulated as follows:

$$\min_{u} F(x, u) \quad \text{subject to} \quad C(x, u) \le 0 \tag{2}$$

where $F(x, u)$ is the objective function (or cost function) and $C(x, u)$ represents the constraints on the variables. Because our goal of
vehicle energy management is to minimize the total fuel cost over a given drive cycle, the objective function $F(x, u)$
represents the accumulated fuel cost:

$$F(x, u) = \sum_{t=1}^{N} fuel\_rate(x(t), u(t), t)\, \Delta t \tag{3}$$

where $N$ is the horizon or number of times that the control is applied, and $X = \langle x(1), \ldots, x(N) \rangle$ and $U = \langle u(1), \ldots, u(N) \rangle$ are the state and control sequences. The
energy optimization problem is solved using Dynamic Programming (DP) based on the dynamics of the power split HEV.
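
As a minimal illustration of equation (3), the following Python sketch accumulates a fuel rate over a drive cycle. The names `accumulated_fuel_cost` and `toy_fuel_rate`, the tuple layout of the control, and the linear fuel-rate model are hypothetical placeholders chosen for illustration; in the paper the fuel rate comes from the engine map described later in this section.

```python
from typing import Callable, Sequence, Tuple

# (battery power, engine speed); this layout is an assumption for illustration
Control = Tuple[float, float]


def accumulated_fuel_cost(
    fuel_rate: Callable[[float, Control, int], float],
    x: Sequence[float],
    u: Sequence[Control],
    dt: float = 1.0,
) -> float:
    """Equation (3): sum of fuel_rate(x(t), u(t), t) * dt for t = 1..N."""
    if len(x) != len(u):
        raise ValueError("state and control sequences must have the same length")
    return sum(
        fuel_rate(x_t, u_t, t) * dt
        for t, (x_t, u_t) in enumerate(zip(x, u), start=1)
    )


def toy_fuel_rate(x_t: float, u_t: Control, t: int) -> float:
    """Hypothetical fuel rate: an idle term plus a term linear in engine speed."""
    _p_batt, w_eng = u_t
    return 0.1 + 0.001 * w_eng
```

Any callable with the same three arguments can be passed in place of `toy_fuel_rate`, for example one backed by a lookup table.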

II.1 Power flow in an HEV

Figure 1 shows power flow in a power split HEV configuration, in which conventional definitions of signs for motor, generator,
and battery power are represented by the arrows. $P_{gen}$, $P_{gen\_e}$, $P_{mot}$, $P_{mot\_e}$, $P_{batt}$ and $P_s$ can flow in both directions, where $P_{gen}$ is
the (mechanical) generator power, $P_{gen\_e}$ is the electrical generator power, $P_{mot}$ is the (mechanical) motor power, $P_{mot\_e}$ is the
electrical motor power, $P_{batt}$ is the power output of the battery and $P_s$ is the internal battery power. Power that flows in the
direction of an arrow is positive; otherwise it is negative. When the power of an electric machine flows in the positive direction,
the electric machine is motoring; otherwise, it is generating.

The output power from the engine, $P_{eng}$, can be split into the power at the ring gear, $P_{ring}$, and the power at the generator, $P_{gen}$. The ring
gear power, $P_{ring}$, represents the mechanical power flow path from the engine to the ring gear to the final drive. The generator
power, $P_{gen}$, represents an electrical path from the engine to the generator to the motor to the final drive. The split of the engine
output  power  between  the  mechanical  path  and  the  electrical  path  is  accomplished  by  controlling  the  engine  speed  with  the 
generator.  The  electrical  motor  draws  the  power  from  the  battery  and  propels  the  vehicle.  The  two  power  paths  provide 
propulsion  power  to  the  final  drive  to  move  the  vehicle  forward  simultaneously  or  independently  [21]. 

Figure 1. Power flow in a split HEV configuration with arrows indicating positive power.

The battery used in this research is a Nickel Metal Hydride (NiMH) battery. As shown in Figure 1, the battery model is
represented by a battery efficiency component and an energy storage component. The battery power is defined as positive when it
is charging and as negative when it is discharging. The battery efficiency component determines the energy loss during
charging and discharging in the battery. The internal battery power, $P_s$, is defined as $P_s(t) = V_{oc}(T, SOC) \cdot I(t)$, where $V_{oc}$ is the open
circuit voltage in the battery model, $T$ is the battery temperature, and $I$ is the battery current. The current, $I$, is calculated from
$V_{oc}$, the battery power, and the battery internal resistance, $R_{resist}$. Based on the definition of $P_s$ above, the open
circuit voltage $V_{oc}$ varies depending on SOC and the temperature. In practice, however, $V_{oc}$ in NiMH batteries varies only slightly within
the SOC range 20-80% [22].

The energy storage block keeps track of the energy level in the battery. The charge level of the battery, $Q$, is represented by the
integration of the current, $I$:

$$Q(t) = Q(0) + \sum_{k=0}^{t} I(k)\, \Delta k \tag{4}$$

Because the proposed battery model is power based and the available SOC range is 40-60%, the battery energy level, $E(t)$ [J], is used
as a state variable in the energy control problem. The battery energy level, $E(t)$, is defined as

$$E(t) = E(0) + \sum_{k=0}^{t} P_s(k)\, \Delta k \tag{5}$$
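
The bookkeeping in equations (4) and (5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `battery_levels` is a hypothetical helper, and the open circuit voltage is treated as a constant, which the text above suggests is a reasonable approximation for NiMH batteries inside the usable SOC range.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def battery_levels(
    q0: float,
    e0: float,
    current: Sequence[float],
    v_oc: float,
    dk: float = 1.0,
) -> Tuple[List[float], List[float]]:
    """Return (Q(t), E(t)) for t = 0..len(current)-1.

    Q(t) = Q(0) + sum_{k=0..t} I(k) * dk        (equation 4)
    E(t) = E(0) + sum_{k=0..t} P_s(k) * dk      (equation 5)
    with P_s(k) = v_oc * I(k); a constant v_oc is an assumption of this sketch.
    """
    p_s = [v_oc * i_k for i_k in current]
    charge = [q0 + s * dk for s in accumulate(current)]
    energy = [e0 + s * dk for s in accumulate(p_s)]
    return charge, energy
```

Positive current raises both levels and negative current lowers them, matching the sign convention in which battery power is positive while charging.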

The power split HEV system allows two degrees of freedom in energy optimization, which we represent by two control
variables, the engine speed, $\omega_{eng}$, and the battery power, $P_{batt}$. The control variables at time $t$ are defined as
$u(t) = (P_{batt}(t), \omega_{eng}(t))$. Once the values of the two control variables are obtained, the speed and power of the other components
(the engine, the motor, and the generator) can be determined based on the kinematics and dynamics of the power split HEV
system. The optimization is subject to the constraints of the individual components in the system: engine, generator, motor and battery.
The operating range of each component is limited in the energy optimization. The inequality constraints for each component are
introduced to limit the minimum and maximum power flow and are defined as follows:
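$$\begin{aligned}
P_{mot\_min}(t) \le P_{mot}(t) \le P_{mot\_max}(t), &\quad \omega_{mot\_min}(t) \le \omega_{mot}(t) \le \omega_{mot\_max}(t), &\quad \forall t \in [1, N] \\
0 \le P_{eng}(t) \le P_{eng\_max}(t), &\quad 0 \le \omega_{eng}(t) \le \omega_{eng\_max}(t), &\quad \forall t \in [1, N] \\
P_{gen\_min}(t) \le P_{gen}(t) \le P_{gen\_max}(t), &\quad \omega_{gen\_min}(t) \le \omega_{gen}(t) \le \omega_{gen\_max}(t), &\quad \forall t \in [1, N] \\
P_{batt\_min}(t) \le P_{batt}(t) \le P_{batt\_max}(t), & &\quad \forall t \in [1, N]
\end{aligned} \tag{6}$$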

where $P_{*\_min}$ is the minimum power of the corresponding component, and $P_{*\_max}$ is the maximum power of the corresponding
component. Similarly, $\omega_{*\_min}$ is the minimum speed of the corresponding component, and $\omega_{*\_max}$ is the maximum speed of the
corresponding component.

In order to create a well-posed problem, an end point constraint is imposed on the state variable, $E$, requiring the energy level
at the end of the given drive cycle to be the same as the initial energy level:

$$E(N) = E(0) \;\Rightarrow\; \sum_{k=0}^{N} P_s(k)\, \Delta k = 0 \tag{7}$$


By  combining  these  inequality  constraints  and  the  end  point  constraint  with  the  objective  function  we  can  ensure  that  the  engine 
speed,  the  battery  energy  level,  engine  power,  generator  power  and  motor  power  are  all  within  their  corresponding  boundaries  in 
the  optimal  solution. 
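
To make the role of the constraints concrete, the sketch below checks a candidate trajectory against box limits in the spirit of (6) and against the end point constraint (7). It is a simplified illustration: `is_feasible`, the dictionary keys, and the use of time-invariant limits are assumptions of this example, whereas the limits in (6) may vary with time.

```python
from typing import Mapping, Sequence, Tuple

Limits = Mapping[str, Tuple[float, float]]  # variable name -> (min, max)


def is_feasible(
    trajectory: Sequence[Mapping[str, float]],
    limits: Limits,
    dk: float = 1.0,
    tol: float = 1e-6,
) -> bool:
    """Check box limits at every step and a zero net change in battery energy."""
    for step in trajectory:
        for name, (lo, hi) in limits.items():
            if not lo <= step[name] <= hi:
                return False
    # End point constraint: the internal battery power must sum to zero.
    net_energy = sum(step["P_s"] for step in trajectory) * dk
    return abs(net_energy) <= tol
```

A trajectory that violates any single limit, or that ends at a different battery energy level than it started with, is rejected.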

In the HEV energy management problem, the objective of DP is to find the optimal sequence of control variables,
$U = \langle u(1), \ldots, u(N) \rangle$, that minimizes the accumulated fuel cost over a given drive cycle. Since the objective is to minimize
fuel cost, the calculation of the instantaneous fuel consumption based on the given control variables is critical. To this end, a
nonlinear static engine efficiency map provided by the manufacturer is used to describe the relationship between fuel
consumption and the state and control variables. Specifically, the instantaneous fuel cost is a function of engine speed and engine
power, denoted as $\Phi_{eng}(P_{eng}, \omega_{eng})$. Assuming the moment of inertia of the engine is negligible and the vehicle speed, $V_s$, vehicle
electrical load, $P_L$, and the driver's power demand, $P_{drive\_sh}$, are known, the instantaneous engine power $P_{eng}$ for the given control
variable $u(t) = (P_{batt}(t), \omega_{eng}(t))$ can be calculated based on the kinematic equations of the HEV as follows:


Step 1: Calculate the motor speed, $\omega_{mot}$, given the current vehicle speed $V_s$:

$$\omega_{mot} = \omega_{ring} = \frac{V_s}{wheel\_r} \times fd\_ratio \tag{8}$$

where $wheel\_r$ is the effective wheel radius and $fd\_ratio$ is the final drive ratio.

Step 2: Calculate the generator speed, $\omega_{gen}$, and the sun gear speed, $\omega_{sun}$:

$$\omega_{gen} = \omega_{sun} = \frac{N_{ring} + N_{sun}}{N_{sun}}\, \omega_{eng} - \frac{N_{ring}}{N_{sun}}\, \omega_{ring} \tag{9}$$

where $N_{sun}$ is the number of teeth of the sun gear and $N_{ring}$ is the number of teeth of the ring gear.

Step 3: Calculate the motor power loss, $P_{mot\_loss}$, and the generator power loss, $P_{gen\_loss}$. These power losses depend on the motor and
generator efficiencies at the given motor and generator torque and speed.

Step 3.1: Calculate the desired engine power, $P^{*}_{eng}$, at steady system operation using the given battery power, $P_{batt}$, the driver's
power demand, $P_{drive\_sh}$, and the electrical load, $P_L$, as:

$$P^{*}_{eng} = P_{drive\_sh} + P_{batt} + P_L \tag{10}$$

This desired engine power, $P^{*}_{eng}$, is the engine power at the steady system, and it is used to calculate the generator
torque, $\tau_{gen}$, and the motor torque, $\tau_{mot}$, in the next step.

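Step 3.2: Calculate the generator torque, $\tau_{gen}$, and the motor torque, $\tau_{mot}$. The desired engine torque, $\tau^{*}_{eng}$, is first calculated by
$\tau^{*}_{eng} = P^{*}_{eng} / \omega_{eng}$. Then the estimated generator torque, $\tau_{gen}$, and ring gear torque, $\tau_{ring}$, are calculated as follows:

$$\tau_{gen} = \frac{N_{sun}}{N_{ring} + N_{sun}}\, \tau^{*}_{eng}, \qquad \tau_{ring} = \frac{N_{ring}}{N_{ring} + N_{sun}}\, \tau^{*}_{eng} \tag{11}$$

The motor torque, $\tau_{mot}$, is calculated based on the fact that the driver's power demand, $P_{drive\_sh}$, in the power split
HEV is equal to the power at the ring gear, $P_{ring}$, plus the motor power, $P_{mot}$, i.e.

$$\begin{aligned}
P_{drive\_sh} &= P_{ring} + P_{mot} = \tau_{ring}\, \omega_{ring} + \tau_{mot}\, \omega_{mot} \\
\tau_{mot} &= \left( P_{drive\_sh} / \omega_{mot} \right) - \tau_{ring}
\end{aligned} \tag{12}$$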

Step 3.3: Using efficiency maps of the generator and the motor, calculate the generator power loss, $P_{gen\_loss}$, and the motor power
loss, $P_{mot\_loss}$, as follows:

$$P_{gen\_loss} = \eta_{gen}(\tau_{gen}, \omega_{gen}), \qquad P_{mot\_loss} = \eta_{mot}(\tau_{mot}, \omega_{mot}) \tag{13}$$

where $\eta_{gen}$ is a generator efficiency map and $\eta_{mot}$ is a motor efficiency map.

Step 4: Calculate the final engine power, $P_{eng}$:

$$P_{eng} = P_{drive\_sh} + P_{batt} + P_{mot\_loss} + P_{gen\_loss} + P_L \tag{14}$$


With the two control variables, battery power, $P_{batt}$, and engine speed, $\omega_{eng}$, the corresponding engine power is calculated using
equations (8)-(14). Then the instantaneous fuel rate can be expressed as a function of the control variables
$u(t) = (P_{batt}(t), \omega_{eng}(t))$ as below:

$$\Phi_{eng}(P_{eng}(t), \omega_{eng}(t)) = fuel\_rate(P_{batt}(t), \omega_{eng}(t) \mid P_{drive\_sh}(t), P_L(t)) \tag{15}$$

The objective function, $F(x, u)$, for the power split HEV optimization is then expressed in equation (16).

$$F(x, u) = \sum_{t=1}^{N} fuel\_rate(P_{batt}(t), \omega_{eng}(t) \mid P_{drive\_sh}(t), P_L(t)) \tag{16}$$
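
Steps 1 to 4 above can be traced in code. The following Python sketch is one possible reading of those steps, not the authors' implementation: it assumes the usual planetary gear relation with the engine on the carrier and the generator on the sun gear, takes the torque split as positive fractions of the desired engine torque, and replaces the manufacturer efficiency maps with a hypothetical `five_percent_loss` function. All names and numeric values are illustrative.

```python
from typing import Callable, Dict

LossMap = Callable[[float, float], float]  # (torque, speed) -> power loss [W]


def engine_power(
    p_batt: float, w_eng: float, v_s: float, p_drive_sh: float, p_load: float,
    *, wheel_r: float, fd_ratio: float, n_sun: int, n_ring: int,
    gen_loss: LossMap, mot_loss: LossMap,
) -> Dict[str, float]:
    """Engine power for one control choice (p_batt, w_eng), following Steps 1-4."""
    if w_eng <= 0 or v_s <= 0:
        raise ValueError("this sketch only covers a running engine and a moving vehicle")
    # Step 1: motor speed equals ring gear speed.
    w_ring = v_s / wheel_r * fd_ratio
    w_mot = w_ring
    # Step 2: generator (sun gear) speed from planetary gear kinematics (assumed form).
    w_gen = (n_ring + n_sun) / n_sun * w_eng - n_ring / n_sun * w_ring
    # Step 3.1: desired engine power at steady operation.
    p_eng_star = p_drive_sh + p_batt + p_load
    # Step 3.2: torque split between generator and ring gear, then motor torque.
    tau_eng_star = p_eng_star / w_eng
    tau_gen = n_sun / (n_ring + n_sun) * tau_eng_star
    tau_ring = n_ring / (n_ring + n_sun) * tau_eng_star
    tau_mot = p_drive_sh / w_mot - tau_ring
    # Step 3.3: component losses from the (placeholder) efficiency maps.
    p_gen_loss = gen_loss(tau_gen, w_gen)
    p_mot_loss = mot_loss(tau_mot, w_mot)
    # Step 4: final engine power.
    p_eng = p_drive_sh + p_batt + p_mot_loss + p_gen_loss + p_load
    return {"w_mot": w_mot, "w_gen": w_gen, "tau_gen": tau_gen,
            "tau_mot": tau_mot, "p_eng": p_eng}


def five_percent_loss(torque: float, speed: float) -> float:
    """Hypothetical loss map: 5% of the mechanical power magnitude."""
    return 0.05 * abs(torque * speed)
```

Feeding the resulting `p_eng` and the chosen engine speed into an engine fuel map would give the instantaneous fuel rate used in the objective function.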


II.2 Energy Optimization in an HEV using Dynamic Programming
The energy optimization problem in an HEV for a given drive cycle, $V(t)$, can be considered as a problem of optimization of a
sequence of dynamic states. Dynamic Programming (DP) is used to find the optimal control variables at every time step of the
drive cycle as shown in equation (17).

$$\min_{u} F(x, u) = \min_{u} \sum_{t=1}^{N} fuel\_rate(P_{batt}(t), \omega_{eng}(t) \mid P_{drive\_sh}(t), P_L(t)) \tag{17}$$


where the driver's power demand, $P_{drive\_sh}(t)$, is a function of $V(t)$. The sampling time, $\Delta t$, for the HEV control problem is
selected to be 1 second because the SOC changes slowly and a 1 second sampling time is sufficient. The DP optimization algorithm
is a multi-step decision process. Based on the principle of optimality, DP finds the sequence of optimal battery power, $P_{batt}(t)$, and
engine speed, $\omega_{eng}(t)$, values that minimize the total fuel consumption over the entire drive cycle $V(t)$ while satisfying all
constraints. In implementation, we build a cost-to-go matrix, $R$ (see Figure 2), based on the battery energy level, $E$, and engine
speed, $\omega_{eng}$, in the temporal domain. The two control variables, $P_{batt}$ and $\omega_{eng}$, and the state variable, $E$, are quantized into grids.
In Figure 2, the engine speed is discretized into 31 different engine speeds, each labeled with an engine index $i$, $i = 1, \ldots, 31$. Here
engine index 1 corresponds to 0 rad/s and engine index 31 corresponds to the maximum engine speed.
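
The quantization of the engine speed can be illustrated as follows. This sketch assumes a uniform grid; the helper names and the maximum speed used in the usage note are hypothetical and not taken from the vehicle model.

```python
from typing import List, Sequence


def engine_speed_grid(w_max: float, levels: int = 31) -> List[float]:
    """Uniform grid from 0 rad/s (index 1 in the text) to w_max (index 31)."""
    if levels < 2:
        raise ValueError("at least two grid levels are required")
    return [w_max * i / (levels - 1) for i in range(levels)]


def nearest_index(grid: Sequence[float], value: float) -> int:
    """Zero-based index of the grid point closest to value."""
    return min(range(len(grid)), key=lambda i: abs(grid[i] - value))
```

For example, with an assumed maximum speed of 450 rad/s the grid spacing is 15 rad/s, and a speed of 16 rad/s maps to the second grid point. Note that the code uses zero-based indices, while the text labels the engine index from 1 to 31.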


Figure 2. Projection of cost matrix R along the driving cycle.


The cost matrix, $R$, with state entries $(E, \omega_{eng})$ keeps track of the minimum fuel cost from the current time, $t$, to the end of the
given drive cycle for each combination of states. Then, according to Bellman's principle of optimality, the optimal decision at the
$t$-th step is made based on the formula below:

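$$\begin{aligned}
R(t, E(t), \omega_{eng}(t)) &= 0, \quad \text{if } t = N, \\
R(t, E(t), \omega_{eng}(t)) &= \min_{P_{batt},\, \omega_{eng}} \left\{ R(t+1, E(t+1), \omega_{eng}(t+1)) + fuel\_rate(P_{batt}(t), \omega_{eng}(t) \mid P_{drive\_sh}(t), P_L(t)) \right\}, \quad t = 1, \ldots, N-1,
\end{aligned}$$

subject to

$$P_{batt\_min}(t) \le P_{batt}(t) \le P_{batt\_max}(t), \quad \omega_{eng\_min}(t) \le \omega_{eng}(t) \le \omega_{eng\_max}(t), \quad (E(t), \omega_{eng}(t)) \in R, \quad (t+1, E(t+1), \omega_{eng}(t+1)) \in R.$$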
The recursive equation is solved backwards from $t = N$ to 1 to get the optimal solution. The sequences of optimal battery power,
$P_{batt}$, and optimal engine speed, $\omega_{eng}$, that minimize the total fuel consumption are then obtained by starting at $E(1)$ and
following the path of minimal cost stored in $R$.
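
The backward recursion and the forward path recovery can be sketched on a toy problem. The Python code below is a deliberately simplified illustration, not the authors' implementation: the state is reduced to the battery energy index alone, battery losses are ignored so that the energy changes by exactly the battery power times the time step, the end point constraint is enforced through an infinite terminal cost, and `toy_fuel_rate` is a hypothetical convex fuel model.

```python
import math
from itertools import product
from typing import Callable, List, Optional, Sequence, Tuple


def solve_dp(
    demand: Sequence[float],
    e_grid: Sequence[float],
    p_batt_grid: Sequence[float],
    w_eng_grid: Sequence[float],
    fuel_rate: Callable[[float, float, float], Optional[float]],
    e0_idx: int,
    dt: float = 1.0,
) -> Tuple[float, List[Tuple[float, float]]]:
    """Backward DP over a uniform energy grid; returns (minimal cost, controls)."""
    n, n_e = len(demand), len(e_grid)
    e_step = e_grid[1] - e_grid[0]  # uniform grid spacing is assumed
    cost = [[math.inf] * n_e for _ in range(n + 1)]
    best = [[None] * n_e for _ in range(n)]
    cost[n][e0_idx] = 0.0  # terminal cost: only E(N) = E(0) is allowed
    for t in range(n - 1, -1, -1):  # solve backwards in time
        for i in range(n_e):
            for p_batt, w_eng in product(p_batt_grid, w_eng_grid):
                j = i + round(p_batt * dt / e_step)  # next energy index
                if not 0 <= j < n_e:
                    continue  # would leave the allowed energy range
                rate = fuel_rate(p_batt, w_eng, demand[t])
                if rate is None:
                    continue  # control infeasible for this demand
                total = rate * dt + cost[t + 1][j]
                if total < cost[t][i]:
                    cost[t][i] = total
                    best[t][i] = (p_batt, w_eng, j)
    controls, i = [], e0_idx
    for t in range(n):  # follow the stored minimal-cost path forwards
        if best[t][i] is None:
            raise ValueError("no feasible charge-sustaining control sequence")
        p_batt, w_eng, i = best[t][i]
        controls.append((p_batt, w_eng))
    return cost[0][e0_idx], controls


def toy_fuel_rate(p_batt: float, w_eng: float, p_demand: float) -> Optional[float]:
    """Hypothetical fuel model: quadratic in engine power, mild speed penalty."""
    p_eng = p_demand + p_batt
    if p_eng < 0:
        return None
    return p_eng ** 2 + 0.001 * (w_eng - 200.0) ** 2
```

On a two-step cycle with a low demand followed by a high demand, this sketch charges the battery in the first step and discharges it in the second, which levels the engine load while returning the battery to its initial energy level.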

The  minimization  of  the  overall  fuel  consumption  through  the  appropriate  split  of  mechanical  power  from  the  engine  and 
electrical  power  from  the  battery  is  the  most  critical  part  in  HEV  energy  management.  Varying  the  power  split  ratio  between  the 
mechanical  and  electrical  paths  throughout  the  drive  cycle  can  result  in  significantly  different  fuel  economy.  The  HEV  model 
under our study is a power split powertrain system, which is used in both the Ford Escape Hybrid and the Toyota Prius. The powertrain
of a power split HEV consists of an engine, generator, motor, battery and planetary gear set [20]-[21].

In  the  power  split  HEV  system,  two  power  sources  are  connected  to  the  wheels  through  a  planetary  gear  set.  One  power  source 
is  the  engine,  and  another  is  the  battery.  The  combination  of  an  engine  and  a  generator  can  provide  power  to  the  driveline  either 
through  an  electrical  path,  a  mechanical  path  or  through  a  combination  of  the  two.  The  combination  of  battery,  motor,  and 
generator  provides  power  to  the  driveline  using  the  battery  power.  Depending  on  the  operation  mode,  either  the  engine  or  the 
motor  or  both  can  provide  the  traction  power  to  the  drivetrain.  During  vehicle  deceleration,  the  regenerative  braking  power  is 
captured  to  charge  the  battery.  In  the  power  split  HEV  powertrain,  the  planetary  gear  set  is  the  key  device  that  connects  the 
engine, generator, and motor to form a power split device. Due to the kinematic property of the planetary gear set, the engine
speed can be decoupled from the vehicle speed [21]. This flexibility in the power split HEV system is one of the degrees of freedom
that can be exploited in the optimization. The focus of this research is the development of machine learning technology to
optimize  energy  consumption  over  a  drive  cycle  with  two  control  variables,  battery  power  and  engine  speed,  or  engine  power  and 
engine  speed.  The  proposed  machine  learning  algorithms  require  the  use  of  a  high  fidelity  vehicle  system  modeling  and 
simulation  program,  such  as  PSAT  (Powertrain  Systems  Analysis  Toolkit),  to  build  an  authentic  vehicle  model,  V,  of  particular 
interest. 


III.  MACHINE  LEARNING  OF  OPTIMAL  POWER  CONTROL  IN  AN  HEV 
The  DP  optimization  of  the  power  split  system  described  in  section  II  assumes  that  the  entire  drive  cycle  V(t),  t  =1,...,  N,  is 
known a priori. However, since the future driving speed is not known during real world driving, we cannot directly
apply the DP optimization approach in an online energy management solution. Instead, our approach is to predict the driving
condition in the near future that can affect the vehicle energy management and use this information to calculate and apply the
optimal energy management solution. We developed a machine learning strategy to learn the optimal power split settings for a set
of standard drive cycles and then generalized the knowledge to online energy management through neural learning. Figure 3
illustrates the proposed machine learning framework, ML_EMO_HEV. It contains two major machine learning processes:
machine learning for driving environment prediction and machine learning for optimal energy management. Specifically, the
framework first uses a neural network to model the road environment of a driving trip as a sequence of roadway types
such as local, freeway, and arterial/collector, augmented with different traffic congestion levels.
This part of the framework uses an additional neural network to model the driver's instantaneous reaction to the driving
environment. Then, based on the current predicted roadway type, traffic congestion level, and driving trend, the framework
uses an additional set of neural networks to emulate the optimal energy management strategy as dictated by DP for the current
conditions, in a way that can be implemented in an online environment.

III.1 Machine learning of the driving environment

A.  Neural  learning  for  prediction  of  roadway  types  and  traffic  congestion  levels 

To represent real world driving conditions, Sierra Research, Inc. has developed a set of 11 standard drive cycles presented in
[23]-[24], called facility-specific (FS) cycles. These cycles represent passenger car and light truck operations over a broad range
of facilities and congestion levels in urban areas. In this research we use this set of 11 FS cycles as the standard measure of
roadway types and traffic congestion levels. For convenience of description we label these 11 FS cycles as $R_1, \ldots, R_{11}$.


Figure 3. ML_EMO_HEV: a computational framework for machine learning of optimal energy management in an HEV.
TABLE I
STATISTICS OF 11 FACILITY SPECIFIC DRIVE CYCLES

Drive Cycle                  Vavg (m/s)   Vmax (m/s)   Amax (m/s^2)   Length (sec)
---------------------------------------------------------------------------------
Freeway LOS A: R1            30.29        35.54        1.02           399
Freeway LOS B: R2            29.94        35.01        1.30           366
Freeway LOS C: R3            29.74        35.19        1.52           448
Freeway LOS D: R4            29.16        34.66        1.30           433
Freeway LOS E: R5            25.56        33.26        1.79           471
Freeway LOS F: R6            14.58        28.53        1.79           536
Freeway Ramps: R7            15.46        26.90        2.55           266
Arterials LOS A-B: R8        11.08        26.32        2.23           737
Arterials LOS C-D: R9         8.58        22.12        2.55           629
Arterials LOS E-F: R10        5.18        17.83        2.59           504
Local: R11                    5.77        17.11        1.65           525

Table I shows the most recent definition of these roadway types and traffic congestion levels [24] along with the labels we
assigned, where $V_{avg}$ is the average vehicle speed, $V_{max}$ is the maximum vehicle speed, and $A_{max}$ is the maximum acceleration. The
11 drive cycles are divided into four categories of roadway types and traffic congestion levels: freeway, freeway ramp, arterial,
and local. Two of the categories, freeway and arterial, are further divided into subcategories based on a qualitative measure called
level of service (LOS) that describes operational conditions within a traffic stream based on speed and travel time, freedom to
maneuver, traffic interruptions, comfort, and convenience. Six types of LOS are defined with labels A through F, with LOS A
representing the best operating conditions and LOS F the worst. Each level of service represents a range of operating conditions
and the driver's perception of those conditions [24]-[25].

With the above definition of standard drive cycles, the problem of optimal vehicle energy management is formulated as
follows. Assume that at any time $t$ of a given drive cycle $DC(t)$, $t \in [0, t_e]$, where $t_e$ is the ending time of the drive cycle, the
vehicle is operating according to one of the 11 roadway types and traffic congestion levels, $R_i$, $i = 1, \ldots, 11$. These roadway types
and  traffic  congestion  levels  will  form  the  basis  for  calculating  the  optimal  energy  management  strategy  using  DP  and  will  also  be 
used  as  the  basis  for  the  online  neural  network  implementation  of  the  DP  emulation. 

We formulate the problem of roadway type and traffic congestion level prediction as follows. Let $RT(t)$ be the roadway types and traffic congestion levels the driver needs to go through to complete the trip, with $t = 0, 1, ..., t_c, ..., t_e$, where $t_c$ is the current time instance and $t_e$ is the time when the trip ends. At any given time $t_c$, $RT(t_c) \in \{R_i \mid i = 1, ..., 11\}$. We predict the roadway type and traffic congestion level in the near future based on the short-term driving history during the trip.

In order to predict the roadway type and traffic congestion level at time $t_c$, we use the driving speed in the segment $[t_c - \Delta W_{RT}, t_c]$. Here the positive value $\Delta W_{RT}$ is the window size of the segment used for making the roadway type and traffic congestion level prediction. The prediction is made at time steps $k\Delta t_{RT}$, $k = 1, 2, ...$, and is used for calculating the optimal energy management strategy over the time period $[t_c, t_c + \Delta t_{RT}]$. The time interval over which the prediction is made is $\Delta t_{RT}$. Figure 4 illustrates these two parameters on the speed profile of the UDDS drive cycle. The x-axis represents the time during the drive cycle and the y-axis represents the vehicle speed in miles per hour. For the purpose of illustration, the segments shown in Figure 4 have an equal size of $\Delta W_{RT} = 150$ seconds and a time step of $\Delta t_{RT} = 100$ seconds. Please note that $\Delta W_{RT} = 150$ seconds and $\Delta t_{RT} = 100$ seconds are chosen here only for clarity of illustration. In practice, as we will show, $\Delta t_{RT}$ should be much smaller than 100 seconds.

The two parameters are important for the accuracy of the prediction and for real-time implementation. Since features characterizing roadway types and traffic congestion levels are extracted from the speed profile of the vehicle in the time interval $[t_c - \Delta W_{RT}, t_c]$, if $\Delta W_{RT}$ is too small, the segment may not contain sufficient information. If $\Delta W_{RT}$ is too big, the segment may contain obsolete information. Based on our extensive study of these two parameters over various drive cycles [26], $\Delta W_{RT} = 50$ seconds and $\Delta t_{RT} = 1$ second are chosen as the window size and prediction time interval, respectively. Our study clearly showed that systems using $\Delta t_{RT} = 1$ second perform significantly better than those using larger time intervals. $\Delta W_{RT} = 50$ seconds is chosen because it gives a shorter delay at the beginning of the drive cycle and is computationally more efficient, which is important for online control. Once $\Delta W_{RT}$ is determined, the 14 features presented in TABLE II are extracted from the speed profile within the time window.
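The windowing scheme can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes a speed trace sampled once per second, and the function name `prediction_windows` and its arguments are hypothetical.

```python
def prediction_windows(speed, window=50, step=1):
    """Yield (t_c, segment) pairs for a speed trace sampled once per second.

    `window` plays the role of the window size (50 s in the text) and `step`
    the prediction time interval (1 s in the text). Nothing is yielded until
    a full window of history is available, which is the start-up delay.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    for t_c in range(window, len(speed) + 1, step):
        # The `window` most recent samples that precede time t_c.
        yield t_c, speed[t_c - window:t_c]


if __name__ == "__main__":
    trace = [float(v) for v in range(60)]          # synthetic 60 s trace
    for t_c, segment in prediction_windows(trace):
        print(t_c, len(segment), segment[0], segment[-1])
```

With a 50-second window, the first prediction becomes available only at $t_c = 50$ seconds, and each later prediction reuses 49 of the previous 50 samples.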

We developed a multi-layered, multi-class neural network, NN_RT&TC, for the prediction of roadway types and traffic congestion levels, as shown in Figure 5. In 5-fold cross validation, the accuracy of the neural network is 95% on the training data and 94% on the test data. Details of the training and test data selection, the feature selection algorithms and the training process are presented in [26]. When NN_RT&TC is used inside a vehicle to predict the roadway type and traffic congestion level at time $t_c$, the vector of the 14 features is extracted from the vehicle speed during the time interval $[t_c - 50, t_c]$ seconds.


Figure 4. Illustration of segments of a speed profile. The X axis represents time measured in seconds, and the Y axis represents speed measured in mph.

TABLE II

FOURTEEN FEATURES SELECTED FOR ROADWAY TYPE AND TRAFFIC CONGESTION LEVEL PREDICTION

| Name of selected feature | Description |
|---|---|
| Trip distance | $\Delta S$, the distance traveled between $t - \Delta W_{RT}$ and $t$ |
| Maximum speed | $\max(v_s(t))$, where $v_s(t)$ is speed and $t \in [t - \Delta W_{RT}, t)$ |
| Maximum acceleration | $\max(a^+)$, where $a^+$ is acceleration and $t \in [t - \Delta W_{RT}, t)$ |
| Maximum deceleration | $\max(\lvert a^- \rvert)$, where $a^-$ is deceleration and $t \in [t - \Delta W_{RT}, t)$ |
| Average speed | $\mathrm{ave}(v_s(t))$: average of $v_s(t)$ for $t \in [t - \Delta W_{RT}, t)$ |
| Average acceleration | $\mathrm{ave}(a^+)$: average of $a^+$ for $t \in [t - \Delta W_{RT}, t)$ |
| S.D. of acceleration | $\sqrt{\mathrm{var}(a^+)}$ for $t \in [t - \Delta W_{RT}, t)$ |
| Average deceleration | $\mathrm{ave}(a^-)$ for $t \in [t - \Delta W_{RT}, t)$ |
| % of time in speed interval 0-15 km/h | $p(v_s(t) \mid 0 < v_s(t) < 15)$ for all $t \in [t - \Delta W_{RT}, t)$ |
| % of time in speed interval 15-30 km/h | $p(v_s(t) \mid 15 < v_s(t) < 30)$ for all $t \in [t - \Delta W_{RT}, t)$ |
| % of time in speed interval >110 km/h | $p(v_s(t) \mid v_s(t) > 110)$ for all $t \in [t - \Delta W_{RT}, t)$ |
| % of time in deceleration interval (-10)-(-2.5) m/s² | $p(a^- \mid -10 < a^- < -2.5)$ for all $t \in [t - \Delta W_{RT}, t)$ |
| % of time in deceleration interval (-2.5)-(-1.5) m/s² | $p(a^- \mid -2.5 < a^- < -1.5)$ for all $t \in [t - \Delta W_{RT}, t)$ |
| Number of acceleration shifts where the acceleration is 0.5-1 m/s² | $\mathrm{Number}(a^+ \mid 0.5 < a^+ < 1.0)$ for all $t \in [t - \Delta W_{RT}, t)$ |
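As an illustration of how such features might be computed from one window of speed samples, the sketch below derives a subset of them. It is not the authors' feature extractor: it assumes speeds in m/s sampled every `dt` seconds, approximates acceleration by a first difference, and uses hypothetical names throughout (`window_features` and the dictionary keys).

```python
import statistics


def window_features(speed_mps, dt=1.0):
    """Compute a few TABLE II-style features from one window of speed samples.

    speed_mps: speeds in m/s sampled every `dt` seconds (an assumption of
    this sketch). Acceleration is approximated by a first difference.
    """
    if len(speed_mps) < 2:
        raise ValueError("need at least two samples")
    accel = [(b - a) / dt for a, b in zip(speed_mps, speed_mps[1:])]
    pos = [a for a in accel if a > 0]      # acceleration samples
    neg = [a for a in accel if a < 0]      # deceleration samples
    kmh = [v * 3.6 for v in speed_mps]     # the speed bins are given in km/h
    n = len(speed_mps)
    return {
        "trip_distance": sum(v * dt for v in speed_mps),   # rectangle rule
        "max_speed": max(speed_mps),
        "avg_speed": statistics.fmean(speed_mps),
        "max_accel": max(pos, default=0.0),
        "max_decel": max((abs(a) for a in neg), default=0.0),
        "avg_accel": statistics.fmean(pos) if pos else 0.0,
        "std_accel": statistics.pstdev(pos) if pos else 0.0,
        "avg_decel": statistics.fmean(neg) if neg else 0.0,
        "pct_speed_0_15_kmh": 100.0 * sum(0 < v < 15 for v in kmh) / n,
        "pct_speed_over_110_kmh": 100.0 * sum(v > 110 for v in kmh) / n,
        "pct_decel_2p5_to_1p5": 100.0 * sum(-2.5 < a < -1.5 for a in accel) / len(accel),
    }


if __name__ == "__main__":
    for name, value in window_features([0, 1, 2, 3, 3, 1]).items():
        print(f"{name}: {value:.3f}")
```

The remaining features in the table follow the same pattern: a filter over the speed or acceleration samples followed by a maximum, mean, fraction or count.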

The output from NN_RT&TC is the roadway type and traffic congestion level to be used by an intelligent vehicle energy management algorithm to determine the optimal power distribution during the time interval $[t_c, t_c + 1]$ seconds. Figure 6 shows an example of a drive cycle, LA92, labeled with the actual roadway types and traffic congestion levels for the cycle according to the definition of the 11 standard FS roadway types and traffic congestion levels in [24]. The X axis indicates the time and the Y axis indicates the vehicle speed in miles per hour. The prediction results generated by the neural network NN_RT&TC are shown in blue. Notice that there is a delay in the prediction for the first 50 seconds, because the algorithm needs at least one full window of data before it can make a prediction.

The prediction results of NN_RT&TC for LA92, along with 9 other cycles from PSAT, are shown in TABLE III. The percentages given are the prediction accuracy of the neural network, calculated as $NN_P = (N_c / N) \times 100\%$, where $N_c$ is the number of times during the drive cycle that the neural network makes a correct prediction of the FS roadway type and traffic congestion level, and $N$ is the number of predictions made by the neural network over the entire cycle. The prediction accuracy is high: on all drive cycles, the roadway type and traffic congestion level are predicted correctly more than 90% of the time. On six drive cycles, the prediction accuracy exceeds 95%. The errors are caused by the time delay of the prediction, which is based on the features extracted from a window of past vehicle speed.
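The accuracy measure is simple to compute once the predicted and actual labels are aligned in time. A minimal sketch follows; the function name `prediction_accuracy` and the example label sequences are illustrative, not taken from the paper.

```python
def prediction_accuracy(predicted, actual):
    """Return NN_P = (N_c / N) * 100 for two equal-length label sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    n_correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * n_correct / len(actual)


if __name__ == "__main__":
    # A predictor that lags one step behind a change in roadway type.
    actual = [9, 9, 9, 3, 3, 3, 3, 3]
    predicted = [9, 9, 9, 9, 3, 3, 3, 3]
    print(prediction_accuracy(predicted, actual))
```

The example mimics the error source named in the text: the predicted label changes one step after the true roadway type does, so the only miss occurs at the transition.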
Figure 5. Neural learning for roadway types and traffic congestion levels prediction.




Figure  6.  Neural  Network  performance  of  roadway  types  and  traffic  congestion  levels  prediction  for  LA92. 


TABLE III

THE PREDICTION PERFORMANCES OF THE NEURAL NETWORK OVER TEST DRIVE CYCLES

| Drive cycle | Neural network prediction performance $NN_P$ (%) | Drive cycle | Neural network prediction performance $NN_P$ (%) |
|---|---|---|---|
| UDDS | 97.04 | REP05 | 95.85 |
| HWFET | 95.38 | NY_CITY | 98.72 |
| US06 | 95.09 | HL07 | 95.96 |
| SC03 | 90.99 | Unif01 | 94.65 |
| LA92 | 91.77 | Arb02 | 91.63 |
B.  Neural  learning  for  predicting  driving  trends

The neural network NN_DT (see Figure 7) is developed for predicting the driving trend at any given time instance $t$. The driving trend is a short-term action taken by the driver in the next few seconds. NN_DT is trained on the following six features of the vehicle state during the time window $[t - \Delta W_{DT}, t)$: $v_{ave}$, $v_{max}$, $v_{min}$, $a_c$, $v_{st}$ and $v_{end}$, where the first four parameters are, respectively, the average speed, maximum speed, minimum speed and average acceleration during the time period $[t - \Delta W_{DT}, t)$, $v_{st}$ is the vehicle speed at $t - \Delta W_{DT}$, and $v_{end}$ is the vehicle speed at $t$. We divide vehicle driving trends into the five classes shown in TABLE IV. The quantitative descriptions are used to automatically label the segments in the training driving cycles, which are the 11 Sierra Research facility-specific driving cycles. NN_DT is a multi-class neural network with 6 input nodes, one hidden layer and 5 output nodes that represent the five classes of driving trends, with a prediction time interval of $\Delta t_{DT} = 1$ second (one-step prediction). In order to decide on the optimal size of $\Delta W_{DT}$, we experimented with various sizes on the training and test driving cycles; the results are shown in Figure 8. The cycles used for testing the driving trend prediction were the 10 cycles provided by the PSAT simulation system. Based on the performance on both training and test data, $\Delta W_{DT} = 9$ seconds is selected because it gave better performance than smaller window sizes and performance similar to that of larger window sizes such as $\Delta W_{DT} = 15$, 30 and 50 seconds.
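The six NN_DT inputs can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the speed trace is assumed to be sampled at 1 Hz, the average acceleration is taken as the net speed change over the window divided by its duration, and the name `driving_trend_features` is hypothetical.

```python
def driving_trend_features(speed, t, window=9, dt=1.0):
    """Six NN_DT-style inputs for time index t of a uniformly sampled trace.

    The statistics cover the half-open window [t - window, t); v_end is the
    speed at t itself. Average acceleration is approximated here as the net
    speed change across the window divided by its duration (an assumption).
    """
    if t - window < 0 or t >= len(speed):
        raise IndexError("window [t - window, t] must lie inside the trace")
    segment = speed[t - window:t]
    v_st, v_end = speed[t - window], speed[t]
    return {
        "v_ave": sum(segment) / len(segment),
        "v_max": max(segment),
        "v_min": min(segment),
        "a_c": (v_end - v_st) / (window * dt),
        "v_st": v_st,
        "v_end": v_end,
    }


if __name__ == "__main__":
    trace = [float(v) for v in range(20)]      # steady 1 unit/s^2 speed-up
    print(driving_trend_features(trace, t=9))
```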


TABLE IV

FIVE CLASSES OF DRIVING TRENDS

| Driving trend class | Description | Quantitative description |
|---|---|---|
| 0 | No speed | $sp = 0$ |
| 1 | Low speed cruise | $0 < sp_{ave} < 58.65$ ft/s and $-0.5 < a_{ave} < 0.5$ ft/s² |
| 2 | High speed cruise | $sp_{ave} > 58.65$ ft/s and $-0.5 < a_{ave} < 0.5$ ft/s² |
| 3 | Acceleration | $a_{ave} > 0.5$ ft/s² |
| 4 | Deceleration | $a_{ave} < -0.5$ ft/s² |
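The automatic labeling rule can be sketched as a small function. The sketch assumes that the cruise band is $-0.5 < a_{ave} < 0.5$ ft/s² and that deceleration means $a_{ave} < -0.5$ ft/s², since the extracted table appears to have lost its minus signs. The table does not say how values falling exactly on a threshold are classified, so their assignment below is also an assumption, as are the function and constant names.

```python
CRUISE_SPEED_SPLIT = 58.65   # ft/s, boundary between low and high speed cruise
ACCEL_BAND = 0.5             # ft/s^2, half-width of the cruise band (assumed symmetric)


def label_driving_trend(sp_ave, a_ave):
    """Map average speed (ft/s) and average acceleration (ft/s^2) to class 0-4."""
    if sp_ave == 0:
        return 0                       # no speed
    if a_ave > ACCEL_BAND:
        return 3                       # acceleration
    if a_ave < -ACCEL_BAND:
        return 4                       # deceleration
    # Inside the cruise band; exact boundary values land here by assumption.
    return 2 if sp_ave > CRUISE_SPEED_SPLIT else 1


if __name__ == "__main__":
    for sp, a in [(0, 0), (30, 0.1), (70, -0.2), (30, 1.0), (30, -1.0)]:
        print(sp, a, label_driving_trend(sp, a))
```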
Figure  8.  Performances  of  driving  trend  prediction  neural  network  on  various  window  sizes.


III.2  Machine  learning  of  optimal  energy  management

A.  Power  split  optimization  using  DP  for  each  Facility  Specific  drive  cycle 

We applied the DP algorithm described in Section II to every Facility Specific (FS) drive cycle, $R_i$, $i = 1, ..., 11$, to find the optimal power settings associated with those roadway types and traffic congestion levels. The algorithm requires the use of a high fidelity vehicle system modeling and simulation program, such as PSAT or ADVISOR. Two major steps in the algorithm require the use of such a simulation program. First, we used the simulation program to build a model of the particular vehicle of interest. Second, we ran the vehicle model in the simulation program to generate step-by-step system state data, $P_{drive\_sh}(t)$, $P_L(t)$ and $v_s(t)$, $t = 1, ..., N$, for every FS drive cycle.

A fuel rate matrix is generated for all possible combinations of the two control variables, battery power, $P_{batt}$, and engine speed, $\omega_{eng}$, within their specific lower and upper bounds, denoted as $P_{batt\_min}(t) \le P_{batt}(t) \le P_{batt\_max}(t)$ and $\omega_{eng\_min}(t) \le \omega_{eng}(t) \le \omega_{eng\_max}(t)$. The fuel rate matrix, $fuel\_rate(P_{batt}(t), \omega_{eng}(t) \mid P_{drive\_sh}(t), P_L(t))$, is generated for each time step $t$ as a function of $P_{batt}(t)$ and $\omega_{eng}(t)$ for the given drivetrain power $P_{drive\_sh}(t)$ and the electric load power $P_L(t)$. Then the DP optimization program described in Section II is applied to every one of the 11 standard FS drive cycles to obtain the optimal sequence of the two control variables, engine speed, $\omega_{eng}$, and battery power, $P_{batt}$, pertaining to each drive cycle. Figure 9 summarizes the major computational steps in generating optimal operating points at every time step for every FS drive cycle.
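To make the backward recursion concrete, the following toy sketch runs DP over a discretized battery energy level with a terminal state-of-charge constraint. It is deliberately simplified and is not the Section II algorithm: only battery power is used as a control variable (engine speed is omitted), the fuel model `toy_fuel_rate` is invented for illustration rather than taken from PSAT, and all names and numbers are hypothetical.

```python
import math


def toy_fuel_rate(p_eng):
    """Hypothetical fuel rate (g/s) for engine power p_eng (kW); not PSAT data."""
    return 0.0 if p_eng == 0 else 0.2 + 0.06 * p_eng


def dp_power_split(demand, controls, e_init, e_final, e_max):
    """Backward DP over integer battery-energy levels 0..e_max.

    demand[t] is the drive power request (kW) and `controls` holds candidate
    battery powers (kW, positive = discharge); both are integers so that the
    energy level changes by exactly -p_batt per one-second step.
    Returns (total_fuel, battery_power_sequence).
    """
    n = len(demand)
    # cost[t][e]: minimum fuel from step t to the end when starting at level e.
    cost = [[math.inf] * (e_max + 1) for _ in range(n + 1)]
    best = [[None] * (e_max + 1) for _ in range(n)]
    cost[n][e_final] = 0.0                       # terminal SOC constraint
    for t in range(n - 1, -1, -1):
        for e in range(e_max + 1):
            for p_batt in controls:
                e_next = e - p_batt
                p_eng = demand[t] - p_batt
                if not 0 <= e_next <= e_max or p_eng < 0:
                    continue                     # violates battery or engine bounds
                c = toy_fuel_rate(p_eng) + cost[t + 1][e_next]
                if c < cost[t][e]:
                    cost[t][e], best[t][e] = c, p_batt
    if math.isinf(cost[0][e_init]):
        raise ValueError("no feasible schedule")
    # Forward pass: follow the stored decisions from the initial level.
    schedule, e = [], e_init
    for t in range(n):
        schedule.append(best[t][e])
        e -= best[t][e]
    return cost[0][e_init], schedule


if __name__ == "__main__":
    fuel, plan = dp_power_split([2, 10, 2, 10], [-2, 0, 2],
                                e_init=5, e_final=5, e_max=10)
    print(fuel, plan)
```

In this toy setting the recursion shuts the engine off during low-demand steps and recharges during high-demand steps while returning to the starting energy level. It also shows why DP needs the whole demand sequence in advance: the cost-to-go at step `t` depends on every later step.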

Figure 10 shows the optimal battery power $P_{batt}$ and engine speed $\omega_{eng}$ generated by DP for the Arterial LOS CD drive cycle, compared with the output generated by the default controller of the Ford Escape model provided by PSAT. The detailed description of this vehicle model will be presented in Part II of this paper series. Table V shows the performance of DP on all 11 Sierra drive cycles. For the purpose of comparison, we also list the performance of the Ford Escape controller provided by the PSAT simulation model on these drive cycles. Since we cannot control the ending SOC for the Ford Escape controller in PSAT, the ending SOC values vary from cycle to cycle. In order to make a fair comparison, we recalculated the DP fuel cost with an SOC correction. Specifically, we reran the Dynamic Programming process starting at 50% SOC and forcing the DP program to end at the same SOC as the Ford Escape controller in PSAT.

The fuel savings from DP control are significant, ranging from 8.95% to 16.80%. However, we need to point out that DP cannot be used for real-time, in-vehicle control, since it requires knowledge of the entire drive cycle. Furthermore, implementing optimal energy management in either a Ford Escape or a Toyota Prius requires trade-offs with other vehicle attributes such as drivability, emissions and OBD. The DP result therefore serves only as an upper bound on the energy optimization achievable in a vehicle for a given drive cycle.
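The savings figures appear to be the fuel saved relative to the Ford Escape controller's consumption. The short sketch below makes that assumption explicit and checks it against two rows of TABLE V; the function name is illustrative.

```python
def fuel_saving_percent(dp_fuel_g, baseline_fuel_g):
    """Percentage of the baseline controller's fuel saved by the DP schedule."""
    if baseline_fuel_g <= 0:
        raise ValueError("baseline fuel must be positive")
    return 100.0 * (baseline_fuel_g - dp_fuel_g) / baseline_fuel_g


if __name__ == "__main__":
    # Fuel consumption in grams (DP, Ford Escape) from two rows of TABLE V.
    print(round(fuel_saving_percent(747.99, 895.64), 2))   # Freeway LOS A
    print(round(fuel_saving_percent(133.63, 146.75), 2))   # Local
```

Under this assumption the computed values agree with the reported savings to within about 0.01 percentage points, a difference consistent with rounding of the tabulated fuel figures.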


Step 1: Build a vehicle model using PSAT simulation software.

Step 2: For each standard FS drive cycle $R_i$, $i = 1, ..., 11$:

Step 3: Generate driving info data on $R_i$ with the vehicle model using PSAT: $P_{drive\_sh}(t)$, $P_L(t)$, $v_s(t)$, $t = 1, ..., N$.

Step 4: Set the $P_{batt}$ and $\omega_{eng}$ boundaries by incorporating the engine and battery constraints: $P_{batt\_min}(t)$ and $P_{batt\_max}(t)$, $\omega_{eng\_min}(t)$ and $\omega_{eng\_max}(t)$.


Figure 9. Computational steps of DP optimization in a HEV.

Figure 10. DP results with Ford Escape model in PSAT for Arterial LOS CD cycle.


TABLE V

PERFORMANCE OF DP OPTIMIZATION ON THE 11 SIERRA DRIVE CYCLES

| Cycle | Cycle time (s) | Fuel consumption, DP (g) | Fuel consumption, Ford Escape (g) | Saving by DP (%) |
|---|---|---|---|---|
| Freeway LOS A: R1 | 400 | 747.99 | 895.64 | 16.48 |
| Freeway LOS B: R2 | 367 | 655.11 | 778.23 | 15.82 |
| Freeway LOS C: R3 | 449 | 795.40 | 947.02 | 16.01 |
| Freeway LOS D: R4 | 434 | 737.74 | 884.63 | 16.60 |
| Freeway LOS E: R5 | 472 | 619.80 | 744.98 | 16.80 |
| Freeway LOS F: R6 | 537 | 302.62 | 350.84 | 13.74 |
| Freeway Ramps: R7 | 270 | 198.16 | 237.98 | 16.73 |
| Arterials LOS A-B: R8 | 738 | 329.72 | 381.83 | 13.65 |
| Arterials LOS C-D: R9 | 630 | 229.45 | 262.57 | 12.61 |
| Arterials LOS E-F: R10 | 504 | 131.32 | 147.24 | 10.81 |
| Local: R11 | 526 | 133.63 | 146.75 | 8.95 |

B.  Neural  network  training  of  optimal  power  solutions  for  Facility  Specific  drive  cycles 

Two sets of neural networks, ($NN^i_{P_{batt}}$, $NN^i_{\omega_{eng}}$), have been developed to learn the optimal power split generated by Dynamic Programming for each of the 11 roadway types and traffic congestion levels, $R_i$, $i = 1, ..., 11$. The neural network $NN^i_{P_{batt}}$ predicts the optimal battery power, $P_{batt}$, and the neural network $NN^i_{\omega_{eng}}$ predicts the optimal engine speed, $\omega_{eng}$, for the roadway type and traffic congestion level $R_i$. The input variables to $NN^i_{P_{batt}}$ are $v_s(t)$, $P_{drive\_sh}(t)$, $DT(t)$ and $SOC(t)$, where $v_s(t)$ is the vehicle speed, $P_{drive\_sh}(t)$ is the driver's power demand, $DT(t)$ is the driving trend, and $SOC(t)$ is the state of charge of the battery. $DT(t)$ can take one of five states: no speed, low speed cruise, high speed cruise, acceleration and deceleration. The input variables to $NN^i_{\omega_{eng}}$ are $v_s(t)$, $P_{drive\_sh}(t)$ and $SOC(t)$.
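One way to assemble the input vectors of the two networks is sketched below. The text does not state how the categorical driving trend is presented to the network; the one-hot encoding used here is an assumption of this sketch, as are the function names.

```python
TRENDS = ("no speed", "low speed cruise", "high speed cruise",
          "acceleration", "deceleration")


def battery_net_input(v_s, p_drive_sh, trend, soc):
    """Input vector for the battery power network: speed, power demand,
    driving trend (one-hot over the five classes, an assumption), and SOC."""
    if trend not in range(len(TRENDS)):
        raise ValueError("trend must be a class index 0-4")
    one_hot = [1.0 if i == trend else 0.0 for i in range(len(TRENDS))]
    return [v_s, p_drive_sh, *one_hot, soc]


def engine_net_input(v_s, p_drive_sh, soc):
    """Input vector for the engine speed network: speed, power demand, SOC."""
    return [v_s, p_drive_sh, soc]


if __name__ == "__main__":
    print(battery_net_input(10.0, 5.0, 3, 0.5))
    print(engine_net_input(10.0, 5.0, 0.5))
```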

Figure 11 shows the architecture of the two neural networks, $NN^i_{P_{batt}}$ and $NN^i_{\omega_{eng}}$. For each standard FS drive cycle, $R_i$, $i = 1, ..., 11$, the training data, $\Omega_i$, for the two neural networks are generated by the procedure described in the last subsection. Here $\Omega_i = \{ v^i(t), P^i_{drive\_sh}(t), DT^i(t), SOC^i(t), P^i_{batt}(t), \omega^i_{eng}(t) \mid t = 1, ..., N^i \}$, where $v^i(t)$, $P^i_{drive\_sh}(t)$, $DT^i(t)$ and $SOC^i(t)$ are the vehicle speed, driver power demand, driving trend and battery state of charge at time $t$, respectively, and $P^i_{batt}(t)$ and $\omega^i_{eng}(t)$ are the optimal battery power and engine speed settings generated by the DP algorithm at time $t$ for drive cycle $R_i$. The variable $N^i$ is the length of $R_i$. The neural networks trained for different FS drive cycles can have different numbers of hidden nodes. The performance of the NN training is measured by the mean squared error (MSE), defined as:


$$\mathrm{MSE} = \frac{1}{N} \sum_{t=1}^{N} \bigl( \mathrm{output}(t) - \mathrm{tar}(t) \bigr)^2, \qquad (19)$$


where $output(t)$ is the NN output and $tar(t)$ is the true target value. Table VI shows the performance, in terms of MSE, of the neural networks $NN^i_{P_{batt}}$ and $NN^i_{\omega_{eng}}$ compared to the DP data for each of the 11 Sierra FS cycles, $R_i$, $i = 1, ..., 11$.
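Equation (19) translates directly into code. The sketch below is a plain Python rendering with an illustrative function name and made-up sample values.

```python
def mean_squared_error(outputs, targets):
    """Mean of squared differences between NN outputs and target values."""
    if len(outputs) != len(targets) or not targets:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)


if __name__ == "__main__":
    nn_out = [0.10, 0.20, 0.35]     # made-up network outputs
    dp_tar = [0.12, 0.18, 0.30]     # made-up DP targets
    print(mean_squared_error(nn_out, dp_tar))
```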


Figure 11. The energy management neural networks, $NN^i_{P_{batt}}$ and $NN^i_{\omega_{eng}}$, for Sierra FS cycle $R_i$, $i = 1, ..., 11$.

[Figure 12 panels: Battery Power (Freeway LOS C); Engine Speed (Freeway LOS C)]

Figure 12. Comparison of optimal engine speed and battery power generated by the neural networks and DP for Freeway LOS C.


TABLE VI

PERFORMANCE OF NEURAL NETWORKS TRAINED FOR INTELLIGENT ENERGY MANAGEMENT

| Cycle | $NN_{P_{batt}}$ (MSE) | $NN_{\omega_{eng}}$ (MSE) | Cycle | $NN_{P_{batt}}$ (MSE) | $NN_{\omega_{eng}}$ (MSE) |
|---|---|---|---|---|---|
| Freeway LOS A | 0.0012 | 0.0009 | Ramp | 0.0004 | 0.0009 |
| Freeway LOS B | 0.0015 | 0.0012 | Local | 0.0004 | 0.0002 |
| Freeway LOS C | 0.0012 | 0.0006 | Arterial LOS AB | 0.0007 | 0.0007 |
| Freeway LOS D | 0.0008 | 0.0006 | Arterial LOS CD | 0.0005 | 0.0003 |
| Freeway LOS E | 0.0011 | 0.0007 | Arterial LOS EF | 0.0007 | 0.0006 |
| Freeway LOS F | 0.0007 | 0.0005 | All 11 cycles | 0.0013 | 0.0008 |

Figure 12 shows the battery power, $P_{batt}$ (top graph), and engine speed, $\omega_{eng}$ (bottom graph), generated by the trained neural networks and by DP on the $R_3$ drive cycle, i.e., Freeway LOS C. It can be observed that the results generated by the neural networks are very close to the results generated by DP, which is an optimal algorithm but cannot be implemented for real-time operation.

In Part II of this paper series, we will present an intelligent energy controller that uses, in real time, NN_RT&TC to predict the current roadway type and traffic congestion level and NN_DT to predict the driving trend, and then, assuming the predicted roadway type is $R_i$, uses the two energy control neural networks, $NN^i_{P_{batt}}$ and $NN^i_{\omega_{eng}}$, trained for the roadway type $R_i$ to generate the optimal battery power and engine speed throughout the drive cycle.
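The dispatch idea in this paragraph can be sketched with stand-in callables. This is only an illustration of how the predicted roadway type selects a network pair. The actual controller is the subject of Part II and is not reproduced here, and every name below (`control_step`, the predictor arguments, the dictionaries of networks) is hypothetical.

```python
def control_step(speed_history, v_s, p_drive_sh, soc,
                 predict_roadway, predict_trend, batt_nets, eng_nets):
    """One control step: choose the network pair trained for the predicted R_i.

    predict_roadway and predict_trend stand in for NN_RT&TC and NN_DT;
    batt_nets and eng_nets map a roadway label to the networks trained for
    that roadway type. All four are supplied by the caller.
    """
    road = predict_roadway(speed_history[-50:])    # 50 s window, as in the text
    trend = predict_trend(speed_history[-9:])      # 9 s window, as in the text
    p_batt = batt_nets[road](v_s, p_drive_sh, trend, soc)
    w_eng = eng_nets[road](v_s, p_drive_sh, soc)
    return road, trend, p_batt, w_eng


if __name__ == "__main__":
    # Stub predictors and networks that return fixed, made-up values.
    result = control_step(
        [12.0] * 60, 12.0, 8.0, 0.5,
        predict_roadway=lambda window: 3,
        predict_trend=lambda window: 1,
        batt_nets={3: lambda v, p, dt, soc: -1.5},
        eng_nets={3: lambda v, p, soc: 2000.0},
    )
    print(result)
```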
IV.  CONCLUSION

We presented a machine learning framework, ML_EMO_HEV, for the optimization of energy management in an HEV. This framework includes machine learning algorithms that predict roadway types and traffic congestion levels and driving trends, and further algorithms that learn optimal energy settings for the predicted roadway types and traffic congestion levels and driving trends. The neural network NN_RT&TC was designed and trained for the prediction of roadway types and traffic congestion levels. It is a multi-class neural network trained to predict which of the 11 standard FS drive cycles the current roadway type and traffic congestion level belongs to. Its accuracy on the 10 test drive cycles ranges from 90.99% to 98.72% (TABLE III). The neural network that predicts the driving trend, NN_DT, predicts one of five classes of driving trend: no speed, low speed cruise, high speed cruise, acceleration and deceleration. Its accuracy for a 9-second window is approximately 94% on both training and test data. For each of the 11 Sierra FS drive cycles, two neural networks were trained, one to emulate the optimal engine speed generated by DP and the other the optimal battery power. The neural networks for generating the optimal engine speed have MSEs ranging from 0.0002 to 0.0012 over the 11 roadway types and traffic congestion levels. The neural networks for generating the optimal battery power have MSEs ranging from 0.0004 to 0.0015. In the second paper in the series, we will present an intelligent online power controller developed under ML_EMO_HEV and its performance in a target vehicle under various training conditions and drive cycles.